Construction of Classifiers by Iterative Compositions of Features with Partial Knowledge

نویسندگان

  • Kazuya Haraguchi
  • Toshihide Ibaraki
چکیده

We consider the classification problem to construct a classifier c : {0, 1}n → {0, 1} from a given set of examples (training set), which (approximately) realizes the hidden oracle y : {0, 1}n → {0, 1} describing the phenomenon under consideration. For this problem, a number of approaches are already known in computational learning theory; e.g., decision trees, support vector machines (SVM), and iteratively composed features (ICF). The last one, ICF, was proposed in our previous work (Haraguchi et al., (2004)). A feature, composed of a nonempty subset S of other features (including the original data attributes), is a Boolean function fS : {0, 1}S → {0, 1} and is constructed according to the proposed rule. The ICF algorithm iterates generation and selection processes of features, and finally adopts one of the generated features as the classifier, where the generation process may be considered as embodying the idea of boosting, since new features are generated from the available features. In this paper, we generalize a feature to an extended Boolean function fS : {0, 1, ∗}S → {0, 1, ∗} to allow partial knowledge, where ∗ denotes the state of uncertainty. We then propose the algorithm ICF∗ to generate such generalized features. The selection process of ICF∗ is also different from that of ICF, in that features are selected so as to cover the entire training set. Our computational experiments indicate that ICF∗ is better than ICF in terms of both classification performance and computation time. Also, it is competitive with other representative learning algorithms such as decision trees and SVM. key words: classification, Boolean functions, partially defined Boolean functions, learning algorithms, iteratively composed features

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

دسته‌بندی پرسش‌ها با استفاده از ترکیب دسته‌بندها

Question answering systems are produced and developed to provide exact answers to the question posted in natural language. One of the most important parts of question answering systems is question classification. The purpose of question classification is predicting the kind of answer needed for the question in natural language. The  literature works can be categorized as rule-based and learning...

متن کامل

Iterative Construction of Hierarchical Classifiers for Phishing Website Detection

This article is devoted to a new iterative construction of hierarchical classifiers in SimpleCLI for the detection of phishing websites. Our new construction of hierarchical systems creates ensembles of ensembles in SimpleCLI by iteratively linking a top-level ensemble to another middle-level ensemble instead of a base classifier so that the top-level ensemble can generate a large multilevel sy...

متن کامل

Fault Detection of Bearings Using a Rule-based Classifier Ensemble and Genetic Algorithm

This paper proposes a reduct construction method based on discernibility matrix simplification. The method works with genetic algorithm. To identify potential problems and prevent complete failure of bearings, a new method based on rule-based classifier ensemble is presented. Genetic algorithm is used for feature reduction. The generated rules of the reducts are used to build the candidate base...

متن کامل

Learning Document Image Features With SqueezeNet Convolutional Neural Network

The classification of various document images is considered an important step towards building a modern digital library or office automation system. Convolutional Neural Network (CNN) classifiers trained with backpropagation are considered to be the current state of the art model for this task. However, there are two major drawbacks for these classifiers: the huge computational power demand for...

متن کامل

Classifier Ensemble Framework: a Diversity Based Approach

Pattern recognition systems are widely used in a host of different fields. Due to some reasons such as lack of knowledge about a method based on which the best classifier is detected for any arbitrary problem, and thanks to significant improvement in accuracy, researchers turn to ensemble methods in almost every task of pattern recognition. Classification as a major task in pattern recognition,...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • IEICE Transactions

دوره 89-A  شماره 

صفحات  -

تاریخ انتشار 2006